Abstract

Motivated by the 'subgraphs world' view of the ferromagnetic Ising 
model, we develop a general approach to studying mixing times of Glauber 
dynamics based on subset expansion expressions for a class of graph poly- 
nomials. With a canonical paths argument, we demonstrate that the 
chains defined within this framework mix rapidly upon graphs of bounded 
tree-width. This extends known results on rapid mixing for the Tutte 
polynomial, the adjacency-rank ($R_2$-)polynomial and the interlace polynomial.

Keywords: Markov chain Monte Carlo, subset expansion, graph poly- 
nomials, tree-width, canonical paths, Tutte polynomial, interlace polyno- 
mial, random cluster model. 

1 Introduction 

We analyse a subset-sampling Markov chain on simple graphs that is derived 
from certain graph functions — usually, in fact, graph polynomials. We show 
that this chain mixes rapidly on graphs of constant tree-width. 

Throughout the paper, the graph functions $\mathcal{P}$ we consider are formulated
using subset expansion^1. An edge subset expansion formula for $\mathcal{P}$ is written as
follows: for any simple graph $G = (V,E)$,

$$\mathcal{P}(G) = \sum_{S \subseteq E} w((V,S)) \tag{1}$$

for some graph function $w$, where $(V, S)$ denotes the graph with vertex set $V$
and edge set $S$. If the function $w$ is non-negative, that is, $w(G) \ge 0$ for all
graphs $G$, we refer to (1) as an edge subset weighting for $\mathcal{P}$ and to $w$ as its
weight function. In fact, we shall need the weight function to be positive on
all subgraphs; from a statistical physics viewpoint, this results in a so-called
'soft-core model'.
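As a concrete reading of (1), the following Python sketch evaluates an edge subset expansion by brute force over all $2^{|E|}$ subsets. The function names and the example weight $2^{|S|}$ are illustrative assumptions rather than anything taken from the paper, and the enumeration is exponential, so it is only meant for tiny graphs.

```python
from itertools import combinations


def edge_subsets(edges):
    """Yield every subset S of the edge list as a tuple."""
    for size in range(len(edges) + 1):
        yield from combinations(edges, size)


def subset_expansion(vertices, edges, weight):
    """Evaluate the sum over S of weight(vertices, S), as in formula (1)."""
    return sum(weight(vertices, subset) for subset in edge_subsets(edges))


# Hypothetical example: a triangle with the positive weight 2^{|S|}.
triangle_vertices = [0, 1, 2]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
value = subset_expansion(
    triangle_vertices, triangle_edges, lambda vertices, subset: 2 ** len(subset)
)
```

For this particular weight the sum factorises edge by edge as $(1 + 2)^3$, which gives a quick sanity check on the enumeration.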

Before moving on, let us anchor the general formula (1) with an example that
is prominent in statistical physics, theoretical computer science, and discrete
probability. The partition function of the random cluster model can be defined
for any $G = (V, E)$ and parameters $q, \mu$ as

$$Z_{RC}(G; q, \mu) := \sum_{S \subseteq E} q^{\kappa(S)} \mu^{|S|}, \tag{2}$$

where $\kappa(S)$ is the number of components in $(V, S)$. For more on the random
cluster model, see the extensive treatise by Grimmett [30]. Notice that, if $q, \mu > 0$,
then $w((V, S)) := q^{\kappa(S)} \mu^{|S|}$ provides an edge subset weighting for $Z_{RC}(G; q, \mu)$.
Under a suitable transformation, $Z_{RC}(G; q, \mu)$ is equivalent to the Tutte
polynomial [57], defined for any $G = (V, E)$ and parameters $x, y$ as

$$T(G; x, y) := \sum_{S \subseteq E} (x-1)^{r(E)-r(S)} (y-1)^{|S|-r(S)}, \tag{3}$$

where $r(S)$ is the $\mathbb{F}_2$-rank of the incidence matrix for $(V, S)$. A wealth of
combinatorial and structural information can be obtained from evaluations of this
function. Indeed, this polynomial has a remarkable universality property, which
informally speaking says that it subsumes any graph invariant that can be
computed by deletion and contraction of edges [50], cf. [55]. In addition, the Tutte
polynomial specialises to several key univariate graph polynomials, including
the chromatic polynomial of Birkhoff [7]. It specialises to the Jones polynomial
in knot theory [38]. By its connection with the random cluster model, it
also generalises the partition functions of the Ising [33] and Potts [51] models^2.
Consult the monograph of Welsh [58] for more on these crucial connections. In
addition to $Z_{RC}(G; q, \mu)$ and $T(G; x, y)$, we shall highlight a few other specific
polynomials from the literature, but for a broad account of the development of
graph polynomials, consult the recent surveys by Makowsky [43] and by
Ellis-Monaghan and Merino [19, 20].

It was shown in 1990 by Jaeger, Vertigan and Welsh [33] that, in general, for
fixed (rational) values of $x$ and $y$, the evaluation of $T(G; x, y)$ is #P-hard, except
on a few special points and curves in the $(x, y)$-plane. As a result, there have
been substantial efforts since then to pin down the approximation complexity of
computing $T(G; x, y)$. For large swaths of the $(x, y)$-plane, it is now known that
the computation of $T(G; x, y)$ either does not admit a fully polynomial-time
randomised approximation scheme (FPRAS) unless RP = NP, or is at least as hard
as #BIS (the problem of counting independent sets in bipartite graphs) under
approximation-preserving reductions, cf. Goldberg and Jerrum [26]. The sole
positive approximation result applicable to general graphs is the breakthrough
FPRAS by Jerrum and Sinclair [36, 37] for the partition function of the
ferromagnetic Ising model; this corresponds to computation of $T(G; x, y)$ along
the portion of the hyperbola $(x-1)(y-1) = 2$ with $y > 1$. Various approaches
have given efficient approximations in some regions of the Tutte plane for
specific classes of graphs; cf. e.g. Alon, Frieze and Welsh [2, 3], Karger [39, 40],
and Bordewich [13]. To obtain their seminal result, Jerrum and Sinclair used
a Markov chain Monte Carlo (MCMC) method, a principal tool in the design
of efficient approximation schemes for counting problems. MCMC methods are
widespread in computational physics, computational biology, machine learning,
and statistics. There have been steady advances in our understanding of such
random processes and in showing how quickly they produce good approximations
of useful probability distributions in huge, complex data sets. See the
lecture notes of Jerrum [35] or a survey by Randall [52] for an overview of the
application of these techniques in theoretical computer science.

1 The term 'subset expansion' was coined by Gordon and Traldi [28], though it is a special
type of 'states model expansion' which is commonly used in physics.

2 If $x, y > 1$ or $q, \mu > 0$, then, respectively, $T(G; x, y)$ or $Z_{RC}(G; q, \mu)$ generalise the
partition functions of the ferromagnetic Ising and Potts models.

We postpone the precise statement of our main result, Theorem 1, as it
requires a host of definitions, but here we give a cursory description. In this
paper, we are interested in the rate of convergence to stationarity of a natural
Markov chain closely associated to a subset weighting of $\mathcal{P}$ (of the form (1)), when
some mild restriction is placed upon the weight function $w$. That restriction,
which we have dubbed $\lambda$-multiplicative, is described in Subsection 2.1; for
now, we remark that some important graph polynomials and partition functions
from statistical physics (e.g. $Z_{RC}(G; q, \mu)$ and $T(G; x, y)$) obey it. The state
space of our chain is the set of all edge subsets, upon which we have set up an
MCMC method using Glauber dynamics [24]. Each possible transition in the
chain is either the addition or deletion of exactly one edge to/from the subset,
and the transition probabilities are defined according to the weights $w((V,S))$,
subject to a Metropolis-Hastings filter [31, 46]^3. Our main finding is that on
graphs of bounded tree-width this Markov chain converges to the stationary
distribution in time that is polynomial in the number of vertices of the graph.

Our approach is inspired in part by the 'subgraphs world' in which Jerrum
and Sinclair [36, 37] designed their FPRAS for the partition function of
the ferromagnetic Ising model. It is also motivated by recent work of Ge and
Stefankovic [22, 23], who introduced the $R_2$-polynomial in an attempt to devise
an FPRAS for #BIS. Their adjacency-rank polynomial is defined for any
$G = (V, E)$ and parameters $q, \mu$ as

$$R_2(G; q, \mu) := \sum_{S \subseteq E} q^{\mathrm{rk}_2(S)} \mu^{|S|}, \tag{4}$$

where $\mathrm{rk}_2(S)$ is the $\mathbb{F}_2$-rank of the adjacency matrix for $(V, S)$. Using a
combinatorial interpretation of $\mathrm{rk}_2$ applicable only to bipartite graphs, they showed that
the edge subset Glauber dynamics (using the weighting in (4)) mixes rapidly
on trees. They conjectured that the chain mixes rapidly on all bipartite graphs,
cf. Conjecture 1 in [22]. In addition, Ge and Stefankovic showed that the Markov
chain for the (soft-core) random cluster model, i.e. weighted according to (2),
mixes rapidly upon graphs of bounded tree-width. We have extended both
of these results under a unified framework. In particular, we show that the
$R_2$-polynomial fits in our framework without recourse to the combinatorial
interpretation for bipartite graphs, and hence that the Markov chain for the
$R_2$-polynomial mixes rapidly upon all graphs of bounded tree-width. We also
remark here that the conjectured rapid mixing of this chain on all bipartite
graphs was disproved by Goldberg and Jerrum [25].
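To make the weight in (4) concrete, here is a small Python sketch that computes the $\mathbb{F}_2$-rank of the adjacency matrix of $(V, S)$ by Gaussian elimination on bitmask rows, and from it the weight $q^{\mathrm{rk}_2(S)} \mu^{|S|}$. The helper names are hypothetical, and vertices are assumed to be labelled $0, \ldots, |V|-1$.

```python
def f2_rank(rows):
    """Rank over F_2 of a matrix whose rows are given as integer bitmasks."""
    basis = {}  # leading-bit position -> stored row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]  # cancel the leading bit and keep reducing
    return len(basis)


def adjacency_rank(num_vertices, edge_subset):
    """rk_2(S): F_2-rank of the adjacency matrix of the graph (V, S)."""
    rows = [0] * num_vertices
    for u, v in edge_subset:
        rows[u] |= 1 << v
        rows[v] |= 1 << u
    return f2_rank(rows)


def r2_weight(num_vertices, edge_subset, q, mu):
    """Weight q^{rk_2(S)} * mu^{|S|} of a single edge subset S."""
    return q ** adjacency_rank(num_vertices, edge_subset) * mu ** len(edge_subset)


triangle = [(0, 1), (1, 2), (0, 2)]
triangle_rank = adjacency_rank(3, triangle)
```

Over $\mathbb{F}_2$ the three rows of the triangle's adjacency matrix sum to zero, so its rank is 2 rather than 3; this is the kind of behaviour that distinguishes $\mathrm{rk}_2$ from the incidence-matrix rank used in (3).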

The polynomials and Markov chains that we capture in our framework are
defined for any graph; however, we obtain rapid mixing results only on graphs of
constant tree-width. For brevity, we will not define tree-width here, but merely
say that it is an essential concept in structural graph theory and parameterised
complexity; see modern surveys on the topic by Bodlaender [12] and Hlincny
et al. [21]. The restriction of tree-width is commonly used in graph algorithms
to reduce the complexity of a computationally difficult problem, usually by way
of dynamic programming. For example, it is already known that many of the
polynomials covered here can be evaluated efficiently for graphs of bounded
tree-width. Independently, Andrzejak [4] and Noble [47] exhibited polynomial-time
algorithms to compute the Tutte polynomial of graphs with bounded tree-width.
Works of Makowsky and Marino [33] and Noble [3H] have significantly
generalised this, in the former case, to a wide array of polynomials under the
framework of monadic second order logic (MSOL), and, in the latter case, to
the so-called $U$-polynomial [49], a polynomial that includes not only the Tutte
polynomial but also a powerful type of knot invariant as a special case.

3 A Metropolis-Hastings filter is applied in order to ensure that the resulting process is a
reversible Markov chain and thus guaranteed to converge to a unique stationary distribution
with state probabilities proportional to the weight.

Even though many of the polynomials we refer to can be computed exactly 
in polynomial time for graphs of bounded tree-width, it remains of interest to 
show that the associated Glauber dynamics is rapidly mixing. One hope is that 
for some polynomials the chain mixes rapidly for a wider class of graphs. There 
have been significant and concerted endeavours by researchers spanning physics, 
computer science and probability to determine the mixing properties of Glauber 
dynamics on many related Markov chains. Spin systems have been of particular 
interest; indeed, the main thrust of the work of Jerrum and Sinclair [36, 37]
was to tackle the partition function for the 'spins world' of the ferromagnetic 
Ising model (using a translation to the rapidly mixing 'subgraphs world'). For 
more on the connections among the 'spins world', the 'subgraphs world' and the 
'random cluster world', see the recent work of Huber [52]. We note that many 
recent projects on spin systems have been restricted to trees or tree-like graphs, 

cf. e.g. IS1H51HH1EZ1I23G2]- 

Our primary focus in this paper is to establish results for polynomials defined 
according to edge subset expansion, but we can also extend our methodology to 
polynomials defined according to vertex subset expansion, which may be viewed 
as the 'induced subgraphs world'. To our knowledge, this form of Markov chain 
has not been greatly examined, but it handles one important graph polynomial 
that was recently introduced by Arratia, Bollobas and Sorkin [5]: the bivariate 
interlace polynomial, defined for any graph G = (V, E) and parameters x, y as 

$$q(G; x, y) := \sum_{S \subseteq V} (x-1)^{\mathrm{rk}_2(S)} (y-1)^{|S| - \mathrm{rk}_2(S)}, \tag{5}$$

where $\mathrm{rk}_2(S)$ is the $\mathbb{F}_2$-rank of the adjacency matrix for $G[S]$. This polynomial
specialises to the independence polynomial and is intimately related to Martin
polynomials pQ. Just as for the Tutte polynomial, computation of the bivariate
interlace polynomial is #P-hard in almost the entire plane, as was shown
by Blaser and Hoffmann [9]. The multivariate interlace polynomial, a
generalisation of the interlace polynomial, can be evaluated efficiently for graphs of
bounded tree-width, cf. Courcelle [T3] and Blaser and Hoffmann [10] [5].
Subject to a condition on the vertex subset weightings, which we have called vertex
$\lambda$-multiplicativity, we can establish rapid mixing for vertex subset Glauber
dynamics on graphs of constant tree-width.

For all of our results, we need that the weight function is strictly positive for 
all (induced) subgraphs. Many of the classical enumeration polynomials such 
as the matching, independence, clique and chromatic polynomials are captured 
by the general polynomials that we mention as examples throughout this work.
However, these are 'hard-core models', in which some (induced) subgraphs have
a zero weighting, and hence are not included in our approach. Many of these
are evaluations that fall at the boundary of the regions that we can handle. For
example, the Tutte polynomial evaluated at the point $(2, 1)$ counts the number
of forests of the graph. We have shown rapid mixing at all fixed points $(2, 1 + \delta)$,
for $\delta > 0$, with a mixing time that depends on $\delta$. It would be interesting to
consider whether the chains associated with these boundary points mix rapidly
for graphs of bounded tree-width.

The structure of this paper is as follows. In the next section, we give the 
definitions that are necessary for a detailed description of the main theorem. We 
give the main theorem in Section 3 and then indicate some of its consequences.
We present the proofs in Section 4. In Section 5, we extend our results to
Glauber dynamics on vertex subsets, that is, on induced subgraphs.

2 Definitions 

2.1 $\lambda$-multiplicative weight functions

In this subsection, we describe the condition we require on our graph functions
$\mathcal{P}$. This condition prescribes that the weight function is multiplicative with
respect to the operation of disjoint graph union as well as "nearly multiplicative"
with respect to the operation of composition via small vertex cuts.

We use the notation $\hat\lambda := \max\{\lambda, 1/\lambda\}$. For a graph $G = (V,E)$, a vertex
cut $K$ is said to separate sets $V_1$ and $V_2$ if $(V_1, K, V_2)$ is a partition of $V$ and
there is no edge of $E$ that is incident to both a vertex of $V_1$ and a vertex of $V_2$.
A partition $(E_1, E_2)$ of $E$ is appropriate (for $K$) if $E_1$ has no edge adjacent to
a vertex in $V_2$ and $E_2$ has no edge adjacent to a vertex in $V_1$.

For fixed $\lambda > 0$, we say that the weight function $w$ is $\lambda$-multiplicative, if
for any $G = (V,E)$, any vertex cut $K$ that separates sets $V_1$ and $V_2$, and any
appropriate partition $(E_1, E_2)$, we have

$$\hat\lambda^{-|K|} \le \frac{w((V_1 \cup K, E_1))\, w((V_2 \cup K, E_2))}{w(G)} \le \hat\lambda^{|K|}. \tag{6}$$

As mentioned above, if $w$ is $\lambda$-multiplicative, then it follows that $w$ is
multiplicative with respect to disjoint union (by taking $K = \emptyset$); furthermore, taking
$V_2 = \emptyset$ implies that the addition or deletion of a few edges in the graph does
not change $w$ wildly.

2.2 Examples of valid polynomials 

In this subsection, we emphasise specific examples of edge subset weightings
and justify that their weight functions are $\lambda$-multiplicative.

Let $G = (V, E)$ be any graph, $K$ be any vertex cut that separates vertex
subsets $V_1$ and $V_2$, and $(E_1, E_2)$ be any appropriate partition. We define $G'$
to be the disjoint union of graphs $(V_1 \cup K, E_1)$ and $(V_2 \cup K, E_2)$. We could
imagine forming $G'$ from $G$ by splitting each vertex in $K$, taking incident edges
in $E_1$ with one copy of the vertex and those in $E_2$ with the other. It is trivial
to verify multiplicativity with respect to disjoint union for each of the weight
functions considered below. Therefore, to establish $\lambda$-multiplicativity for these
weight functions, it will suffice to verify that $\hat\lambda^{-|K|} \le w(G')/w(G) \le \hat\lambda^{|K|}$.

First, we observe that the partition function of the random cluster model for
$q, \mu > 0$ satisfies the condition. Recalling (2), the relevant weight function is
$w((V, S)) := q^{\kappa(S)} \mu^{|S|}$. To handle the $\mu^{|S|}$ factor, note that the graphs $G$ and $G'$
have the same number of edges. For the $q^{\kappa(S)}$ factor, the number of components
in $G'$ is at least $\kappa(G)$ and can be at most $\kappa(G) + |K|$ since $G'$ can be obtained by
splitting $|K|$ vertices of $G$. Thus, $w$ is $\lambda$-multiplicative if we take $\lambda := q$.
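The bound just described can be checked exhaustively on a small instance. The sketch below is only an illustration under assumed data: a hypothetical six-vertex graph with cut $K = \{2, 3\}$, and the values $q = 1/2$, $\mu = 3$. It computes the ratio appearing in (6) for every edge subset, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations


def count_components(vertices, edges):
    """Number of connected components of the graph (vertices, edges)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = len(parent)
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return components


def rc_weight(vertices, edges, q, mu):
    """Random cluster weight q^{kappa(S)} * mu^{|S|}."""
    return q ** count_components(vertices, edges) * mu ** len(edges)


def edge_subsets(edges):
    for size in range(len(edges) + 1):
        yield from combinations(edges, size)


def cut_ratios(v1, cut, v2, e1, e2, q, mu):
    """Ratios w(G_1) w(G_2) / w(G) over all subsets S, split as (S in E1, S in E2)."""
    ratios = []
    for s1 in edge_subsets(e1):
        for s2 in edge_subsets(e2):
            left = rc_weight(v1 + cut, s1, q, mu)
            right = rc_weight(v2 + cut, s2, q, mu)
            whole = rc_weight(v1 + cut + v2, s1 + s2, q, mu)
            ratios.append(left * right / whole)
    return ratios


# Hypothetical instance: K = {2, 3} separates V1 = {0, 1} from V2 = {4, 5}.
ratios = cut_ratios(
    [0, 1], [2, 3], [4, 5],
    [(0, 1), (0, 2), (1, 3), (2, 3)],   # E1: edges avoiding V2
    [(2, 4), (3, 5), (4, 5)],           # E2: edges avoiding V1
    Fraction(1, 2), Fraction(3),
)
lam_hat = Fraction(2)  # max{q, 1/q} for q = 1/2
within_bounds = all(lam_hat ** -2 <= r <= lam_hat ** 2 for r in ratios)
```

Each ratio equals $q$ raised to the number of extra components created by splitting the cut vertices, which is why only the $q^{\kappa(S)}$ factor matters here.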

This can also be seen in the context of the Tutte polynomial when $x, y > 1$.
Recalling (3), the relevant weight function is
$w((V, S)) := (x-1)^{r(E)-r(S)} (y-1)^{|S|-r(S)}$. As before, it is easy to take care of the
$(x-1)^{r(E)} (y-1)^{|S|}$ factor.
For the remaining $((x-1)(y-1))^{-r(S)}$ factor, it is enough to observe that
the incidence matrix of $G$ may be obtained from the incidence matrix of $G'$ as
follows. The matrix for $G'$ has two rows for each of the vertices in $K$, one from
$(V_1 \cup K, E_1)$ and one from $(V_2 \cup K, E_2)$. If we replace one of these two rows
with the sum of the two rows, we do not alter the rank; if we then delete the
other of the two rows, we change the rank by at most 1. Doing this for each
vertex in $K$, we obtain the incidence matrix for $G$, at a total change in the rank
$r$ of the incidence matrix of at most $|K|$. Thus, $w$ is $\lambda$-multiplicative if we take
$\lambda := (x-1)(y-1)$.
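The correspondence between $\lambda = (x-1)(y-1)$ here and $\lambda = q$ for the random cluster model reflects the transformation between (2) and (3). Assuming the standard fact that the $\mathbb{F}_2$-rank of the incidence matrix satisfies $r(S) = |V| - \kappa(S)$, one gets $T(G; x, y) = (x-1)^{-\kappa(E)} (y-1)^{-|V|} Z_{RC}(G; (x-1)(y-1), y-1)$. The Python sketch below, with hypothetical function names and vertices labelled $0, \ldots, |V|-1$, checks this identity by brute force.

```python
from fractions import Fraction
from itertools import combinations


def count_components(num_vertices, edges):
    """Number of connected components of ({0..n-1}, edges) via union-find."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = num_vertices
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return components


def edge_subsets(edges):
    for size in range(len(edges) + 1):
        yield from combinations(edges, size)


def random_cluster(num_vertices, edges, q, mu):
    """Z_RC(G; q, mu) as in (2)."""
    return sum(q ** count_components(num_vertices, s) * mu ** len(s)
               for s in edge_subsets(edges))


def tutte(num_vertices, edges, x, y):
    """T(G; x, y) as in (3), using r(S) = |V| - kappa(S)."""
    def rank(subset):
        return num_vertices - count_components(num_vertices, subset)

    full_rank = rank(edges)
    return sum((x - 1) ** (full_rank - rank(s)) * (y - 1) ** (len(s) - rank(s))
               for s in edge_subsets(edges))


def tutte_via_random_cluster(num_vertices, edges, x, y):
    """Right-hand side of the identity; requires x != 1 and y != 1."""
    x, y = Fraction(x), Fraction(y)
    scale = ((x - 1) ** -count_components(num_vertices, edges)
             * (y - 1) ** -num_vertices)
    return scale * random_cluster(num_vertices, edges, (x - 1) * (y - 1), y - 1)


triangle = [(0, 1), (1, 2), (0, 2)]
direct = tutte(3, triangle, 3, 2)
indirect = tutte_via_random_cluster(3, triangle, 3, 2)
```

For the triangle, the Tutte polynomial is $x^2 + x + y$, so both routes give 14 at $(x, y) = (3, 2)$.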

Next, we see that the adjacency-rank polynomial of Ge and Stefankovic [22]
satisfies the condition if $q, \mu > 0$. Recalling (4), the relevant weight function
is $w((V,S)) := q^{\mathrm{rk}_2(S)} \mu^{|S|}$. As before, it is simple to handle the $\mu^{|S|}$ factor.
For the $q^{\mathrm{rk}_2(S)}$ factor, we note that the adjacency matrix of $G$ may be formed
from the adjacency matrix of $G'$ by $|K|$ row additions, followed by $|K|$ column
additions and finally the deletion of $|K|$ rows and $|K|$ columns. Since we must
delete both rows and columns, the rank $\mathrm{rk}_2$ of the adjacency matrix may change
by up to $2|K|$. Thus, in this case, $w$ is $\lambda$-multiplicative when taking $\lambda := q^2$.

Now, consider the multivariate Tutte polynomial as formulated by Sokal [54],
defined for any graph $G = (V, E)$ and parameters $q$, $\mathbf{v} = \{v_e\}_{e \in E}$ by

$$Z_{\mathrm{Tutte}}(G; q, \mathbf{v}) := \sum_{S \subseteq E} q^{\kappa(S)} \prod_{e \in S} v_e. \tag{7}$$

Under this expansion, $w := q^{\kappa(S)} \prod_{e \in S} v_e$ is an edge subset weight function if
$q > 0$ and $v_e > 0$ for any $e \in E$ are fixed. We can handle the $q^{\kappa(S)}$ factor as
we did for the random cluster model partition function. For the $\prod_{e \in S} v_e$ factor,
observe that $G$ and $G'$ have the same set of edges. Thus, $w$ is $\lambda$-multiplicative
when taking $\lambda := q$.

Last, we discuss the $U$-polynomial of Noble and Welsh [49], defined for any
graph $G = (V, E)$ and parameters $y$, $\mathbf{x} = \{x_i\}_{i=1}^{|V|}$ by

$$U(G; \mathbf{x}, y) := \sum_{S \subseteq E} (y-1)^{|S|-r(S)} \prod_{i=1}^{|V|} x_i^{\kappa(i,S)}, \tag{8}$$

where $\kappa(i, S)$ denotes the number of components of order $i$ in $(V, S)$. If $y > 1$
and $x_i > 0$ for all $i$, then $w((V, S)) := (y-1)^{|S|-r(S)} \prod_{i=1}^{|V|} x_i^{\kappa(i,S)}$ gives an
edge subset weighting. The $(y-1)^{|S|-r(S)}$ factor can be handled as above. For
the $\prod_{i=1}^{|V|} x_i^{\kappa(i,S)}$ factor, observe that $\sum_i |\kappa(i, G') - \kappa(i, G)|$ is at most $3|K|$,
since, if we obtain $G'$ from $G$ by splitting the vertices in $K$, each time we split a vertex
we either change the size of a single component or split a single component
into two smaller components. Thus, taking $x' := \max_i \max\{x_i, x_i^{-1}\}$ and
$y' := \max\{y-1, (y-1)^{-1}\}$, we see that $w$ is $\lambda$-multiplicative when taking $\lambda := y' x'^3$.

2.3 Glauber dynamics for edge subsets 

In this subsection, we define the Markov chain associated with the edge subset
expansion formula for $\mathcal{P}$. From the formulation in (1), the single bond flip chain
$\mathcal{M}$ on a given graph $G = (V, E)$ is defined as follows. We start with an arbitrary
subset $X_0 \subseteq E$ and repeatedly generate $X_{t+1}$ from $X_t$ by running the following
experiment.

1. Pick an edge $e \in E$ uniformly at random and let $S = X_t \oplus \{e\}$.

2. Set $X_{t+1} = S$ with probability $\frac{1}{2} \min\{1, w((V, S))/w((V, X_t))\}$ and with
the remaining probability set $X_{t+1} = X_t$.
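The two steps above translate almost directly into code. The following Python sketch is a minimal illustration, assuming the weight is supplied as a function of the edge subset alone (the vertex set being fixed) and that states are stored as frozensets; the product weight $2^{|S|}$ in the example is a hypothetical choice made only because its stationary distribution is easy to describe.

```python
import random


def bond_flip_step(state, edges, weight, rng):
    """One transition of the single bond flip chain from the edge set `state`."""
    edge = rng.choice(edges)                 # step 1: uniform random edge e
    proposal = state ^ {edge}                # S = X_t symmetric-difference {e}
    accept = 0.5 * min(1.0, weight(proposal) / weight(state))
    return proposal if rng.random() < accept else state  # step 2


def run_chain(edges, weight, steps, seed=0):
    """Run the chain from the empty subset and return the visited states."""
    rng = random.Random(seed)
    state = frozenset()
    visited = []
    for _ in range(steps):
        state = bond_flip_step(state, edges, weight, rng)
        visited.append(state)
    return visited


# Hypothetical example: triangle with weight 2^{|S|}.
edges = [(0, 1), (1, 2), (0, 2)]
weight = lambda subset: 2.0 ** len(subset)
trajectory = run_chain(edges, weight, steps=20000)
average_size = sum(len(s) for s in trajectory) / len(trajectory)
```

With this product weight each edge is present independently with probability $2/3$ under the stationary distribution, so the long-run average of $|X_t|$ should be close to 2; the sketch does not attempt to say how many steps are needed in general.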

By convention, we denote the state space of $\mathcal{M}$ by $\Omega$ (i.e. $\Omega = 2^E$) and its
transition probability matrix by $P$. With standard arguments, it can be shown
that $\mathcal{M}$ is a reversible Markov chain that has a unique stationary distribution
$\pi$ satisfying $\pi(S) \propto w((V,S))$. Hence, we may use $\mathcal{M}$ as a Markov chain in
MCMC sampling for the following problem.

PWE($\mathcal{P}$): $\mathcal{P}$-weighted Edge Subsets
Input: a graph $G = (V, E)$.

Output: a subset $S \subseteq E$ with probability $w((V, S))/\mathcal{P}(G)$.

The term rapidly mixing applies to a Markov chain that quickly converges
to its stationary distribution. We make this precise here. The total variation
distance $\|\nu - \nu'\|_{TV}$ between two probability distributions $\nu$ and $\nu'$ is defined by
$\|\nu - \nu'\|_{TV} = \frac{1}{2} \sum_{H \in \Omega} |\nu(H) - \nu'(H)|$. For $\varepsilon > 0$, the mixing time of a Markov
chain $\mathcal{M}$ (with state space $\Omega$, transition matrix $P$ and stationary distribution
$\pi$) is defined as

$$\tau(\varepsilon) := \max_{H \in \Omega} \left\{ \min\left\{ t \mid \|P^t(H, \cdot) - \pi(\cdot)\|_{TV} \le \varepsilon \right\} \right\}.$$

In this paper, we shall say that a chain $\mathcal{M}$ mixes rapidly if, for any fixed $\varepsilon$, $\tau(\varepsilon)$
is (upper) bounded by a polynomial in the number of vertices of the input graph.
This definition for rapid mixing is the one more commonly used in theoretical
computer science, whereas often in statistical physics or discrete probability a
stricter $O(n \log n)$ bound is mandated.
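For very small graphs the mixing time can be computed exactly from this definition. The Python sketch below builds the transition matrix of the single bond flip chain as nested lists and iterates $P^t$ until every row is within $\varepsilon$ of $\pi$ in total variation; since the distance to stationarity never increases with $t$, the first such $t$ equals $\tau(\varepsilon)$. All names, and the example weight $2^{|S|}$, are assumptions made for illustration.

```python
from itertools import combinations


def all_states(edges):
    """All subsets of the edge list, as frozensets."""
    return [frozenset(c) for k in range(len(edges) + 1)
            for c in combinations(edges, k)]


def transition_matrix(edges, weight):
    """Transition matrix of the single bond flip chain, as nested lists."""
    states = all_states(edges)
    index = {state: i for i, state in enumerate(states)}
    matrix = [[0.0] * len(states) for _ in states]
    for state in states:
        row = matrix[index[state]]
        for edge in edges:
            proposal = state ^ {edge}
            accept = 0.5 * min(1.0, weight(proposal) / weight(state))
            row[index[proposal]] += accept / len(edges)
        row[index[state]] = 1.0 - sum(row)  # probability of staying put
    return states, matrix


def total_variation(p, q):
    """Total variation distance between two distributions given as lists."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def mixing_time(matrix, stationary, eps, max_steps=10_000):
    """Smallest t with max over start states H of ||P^t(H, .) - pi||_TV <= eps."""
    size = len(matrix)
    rows = [[float(i == j) for j in range(size)] for i in range(size)]  # P^0
    for t in range(max_steps + 1):
        if max(total_variation(row, stationary) for row in rows) <= eps:
            return t
        rows = [[sum(row[k] * matrix[k][j] for k in range(size))
                 for j in range(size)] for row in rows]
    raise RuntimeError("chain did not mix within max_steps")


edges = [(0, 1), (1, 2), (0, 2)]
weight = lambda subset: 2.0 ** len(subset)
states, P = transition_matrix(edges, weight)
normaliser = sum(weight(s) for s in states)
pi = [weight(s) / normaliser for s in states]
tau = mixing_time(P, pi, eps=0.25)
```

The same matrix can be used to confirm reversibility numerically, by checking $\pi(S) P(S, S') = \pi(S') P(S', S)$ for every pair of states.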

3 Results 

We are now prepared to state the main theorem. 

Theorem 1. Let $G = (V,E)$ where $|V| = n$. If $w$ is $\lambda$-multiplicative for some
$\lambda > 0$, then the mixing time of $\mathcal{M}$ on $G$ satisfies

$$\tau(\varepsilon) = O\left(n^{4+4(\mathrm{tw}(G)+1)|\log\lambda|} \log(1/\varepsilon)\right)$$

(where $\mathrm{tw}(G)$ denotes the tree-width of $G$).

In Subsection 2.2, we noted some examples of polynomials that have
$\lambda$-multiplicative weight functions; thus, Theorem 1 implies that each of their
associated Glauber dynamics on edge subsets is rapidly mixing upon graphs of
bounded tree-width.

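Corollary 2. Let $G = (V,E)$ where $|V| = n$. In the following list, we state
conditions on the parameters which guarantee rapid mixing of the single bond
flip chain on $G$ associated with the stated polynomial and weighting. We also
state the mixing time bound.

1. For fixed $q, \mu > 0$ and the weighting (2) of $Z_{RC}(G; q, \mu)$, the mixing time
satisfies

$$\tau(\varepsilon) = O\left(n^{4+4(\mathrm{tw}(G)+1)|\log q|} \log(1/\varepsilon)\right).$$

Equivalently, for fixed $x, y > 1$ and the weighting (3) of $T(G; x, y)$, the
mixing time satisfies

$$\tau(\varepsilon) = O\left(n^{4+4(\mathrm{tw}(G)+1)|\log((x-1)(y-1))|} \log(1/\varepsilon)\right).$$

2. For fixed $q, \mu > 0$ and the weighting (4) of $R_2(G; q, \mu)$, the mixing time
satisfies

$$\tau(\varepsilon) = O\left(n^{4+4(\mathrm{tw}(G)+1)|\log(q^2)|} \log(1/\varepsilon)\right).$$

3. For fixed $q > 0$ and $v_e > 0$ for all $e$ and the weighting (7) of $Z_{\mathrm{Tutte}}(G; q, \mathbf{v})$,
the mixing time satisfies

$$\tau(\varepsilon) = O\left(n^{4+4(\mathrm{tw}(G)+1)|\log q|} \log(1/\varepsilon)\right).$$

4. For fixed $y > 1$ and $x_i > 0$ for all $i$ and the weighting (8) of $U(G; \mathbf{x}, y)$,
the mixing time satisfies

$$\tau(\varepsilon) = O\left(n^{4+4(\mathrm{tw}(G)+1)|\log(y' x'^3)|} \log(1/\varepsilon)\right),$$

where $x' = \max_i \max\{x_i, x_i^{-1}\}$ and $y' = \max\{y-1, (y-1)^{-1}\}$.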

Here, we remark that Ge and Stefankovic obtained part 1 above and showed 
part 2 above in the special case of trees. Parts 2-4 directly extend these findings, 
and our main theorem considerably broadens the scope of mixing time bounds 
for subset Glauber dynamics on graphs of bounded tree-width. 

4 Proofs 

Let us first give an outline of the proof. 

Although our main result is stated in terms of tree-width, we do not treat 
tree-width directly but instead use linear-width, a more restrictive width pa- 
rameter introduced by Thomas [56] . This strategy was also employed by Ge 
and Stefankovic in the two specific cases mentioned above. For any graph
$G = (V,E)$, an ordering $(e_1, \ldots, e_m)$ of $E$ has linear-width at most $\ell$, if, for
each $i \in \{1, \ldots, m\}$, there are at most $\ell$ vertices that are incident to both an
edge in $\{e_1, \ldots, e_{i-1}\}$ and an edge in $\{e_i, \ldots, e_m\}$. The linear-width $\mathrm{lw}(G)$ of
$G = (V, E)$ is the smallest integer $\ell$ such that there is an ordering of $E$ with
linear-width at most $\ell$. The motive for using linear-width is that it implies an
ordering of the edges which we can then use to define canonical paths between
pairs of edge subsets. Then we show that $\lambda$-multiplicativity is the general
condition under which we can bound the congestion of these canonical paths. The
use of canonical paths is a standard technique for obtaining a bound on the
mixing time of MCMC methods; see the lecture notes of Jerrum [35] for an
expository account of this approach.
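The definition of linear-width can be checked mechanically on small graphs. The Python sketch below evaluates the width of a given edge ordering and, by trying every permutation, the linear-width of a tiny graph; the function names are hypothetical and the brute-force search is factorial in $|E|$.

```python
from itertools import permutations


def ordering_linear_width(ordering):
    """Largest number of vertices shared by {e_1..e_{i-1}} and {e_i..e_m} over all i."""
    width = 0
    for i in range(len(ordering)):
        before = {v for edge in ordering[:i] for v in edge}   # e_1, ..., e_{i-1}
        after = {v for edge in ordering[i:] for v in edge}    # e_i, ..., e_m
        width = max(width, len(before & after))
    return width


def linear_width(edges):
    """Brute-force linear-width: best width over all orderings of the edges."""
    return min(ordering_linear_width(list(p)) for p in permutations(edges))


path = [(0, 1), (1, 2), (2, 3)]
natural_width = ordering_linear_width(path)
shuffled_width = ordering_linear_width([(0, 1), (2, 3), (1, 2)])
```

On the path, following the edges in their natural order keeps a single boundary vertex at every step, whereas taking both end edges first forces two boundary vertices before the middle edge is placed.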

The key property we require that relates the linear-width of $G$ to the more
well-studied parameters path-width $\mathrm{pw}(G)$ and tree-width $\mathrm{tw}(G)$ of $G$ is the
following set of inequalities, details of which can be found in Bodlaender [11],
Chung and Seymour [TJ], Fomin and Thilikos [21], Ge and Stefankovic [22], and
Korach and Solel [42]. For any graph $G$ on $n$ vertices,

$$\mathrm{pw}(G) \le \mathrm{lw}(G) \le \mathrm{pw}(G) + 1 \le (\mathrm{tw}(G) + 1)(\lfloor \log_2 n \rfloor + 1) + 1. \tag{9}$$

We follow a canonical paths strategy to bound the mixing time of $\mathcal{M}$. Given
$G = (V, E)$, let $\sigma = (e_1, \ldots, e_m)$ be an ordering of $E$. Given $I, F \in \Omega$, let $I \oplus F$
denote the symmetric difference of $I$ and $F$, let $\sigma[I \oplus F] := (e_{i_1}, \ldots, e_{i_k})$ denote
the restriction of $\sigma$ to $I \oplus F$ (that is, $\{e_{i_1}, \ldots, e_{i_k}\} = I \oplus F$ and $i_1 < \cdots < i_k$),
and let $\gamma_{\sigma,I \to F}$ denote the canonical path from $I$ to $F$, defined as

$$\gamma_{\sigma,I \to F} := (H_0, \ldots, H_k),$$

where $H_0 = I$, $H_j = H_{j-1} \oplus \{e_{i_j}\}$ for all $j \in \{1, \ldots, k\}$ (and hence $H_k = F$).
Let $\Gamma_\sigma = \{\gamma_{\sigma,I \to F} \mid I, F \in \Omega\}$.
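The canonical path is simply "flip the edges on which $I$ and $F$ disagree, in the order given by $\sigma$". A minimal Python sketch, assuming edge subsets are frozensets and the ordering is a list of edges (the function name and the string edge labels are hypothetical):

```python
def canonical_path(ordering, start, end):
    """States H_0, ..., H_k visited when flipping the edges of start ^ end in order."""
    difference = start ^ end
    path = [start]
    for edge in ordering:
        if edge in difference:
            path.append(path[-1] ^ {edge})  # H_j = H_{j-1} xor {e_{i_j}}
    return path


ordering = ["a", "b", "c", "d"]
start = frozenset({"a", "c"})
end = frozenset({"b", "c", "d"})
path = canonical_path(ordering, start, end)
```

Here the symmetric difference is $\{a, b, d\}$, so the path has three transitions: it removes $a$, then adds $b$, then adds $d$, and the edge $c$ on which the endpoints agree is never touched.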

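To bound the mixing time of $\mathcal{M}$, we will, for some appropriately chosen $\sigma$,
bound the congestion $\varrho(\Gamma_\sigma)$ of the canonical paths, which is defined by

$$\varrho(\Gamma_\sigma) := \max_{\substack{(H,H') \in \Omega \times \Omega \\ P(H,H') > 0}} \left\{ \frac{1}{\pi(H) P(H,H')} \sum_{\substack{(I,F) \in \Omega \times \Omega \\ (H,H') \in \gamma_{\sigma,I \to F}}} \pi(I)\,\pi(F)\,|\gamma_{\sigma,I \to F}| \right\}, \tag{10}$$

where $|\gamma_{\sigma,I \to F}|$ denotes the length of the path $\gamma_{\sigma,I \to F}$. The mixing time can then
be bounded using the following inequality of Sinclair [53] (see also Diaconis
and Stroock [17]): for any set $\Gamma$ of canonical paths,

$$\tau(\varepsilon) \le \max_{I \in \Omega} \left\{ \varrho(\Gamma) \left( \log\frac{1}{\pi(I)} + \log\frac{1}{\varepsilon} \right) \right\}. \tag{11}$$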

The remainder of the section is devoted to showing the following. 

Theorem 3. Suppose $G = (V, E)$ has linear-width $\ell$ and let $\sigma = (e_1, \ldots, e_m)$ be
an ordering of $E$ with linear-width at most $\ell$. If $w$ is $\lambda$-multiplicative for some
$\lambda > 0$, then $\varrho(\Gamma_\sigma) \le 2m^2 \hat\lambda^{4\ell}$.

Theorem 3 immediately implies a good mixing time bound for the Markov chain
$\mathcal{M}$ and hence Theorem 1 follows.

Corollary 4. Let $G = (V,E)$ where $|E| = m$. If $w$ is $\lambda$-multiplicative for some
$\lambda > 0$, then the mixing time of $\mathcal{M}$ on $G$ satisfies

$$\tau(\varepsilon) = O\left(m^2 \hat\lambda^{4\,\mathrm{lw}(G)} \log(1/\varepsilon)\right).$$

Proof. Substitute the congestion bound of Theorem 3 into the inequality (11). □

Proof of Theorem 1. Substitute the upper bound on $\mathrm{lw}(G)$ in (9) into
Corollary 4. □
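The substitution can be spelled out as follows. This is only a sketch under stated assumptions: logarithms are taken to base 2, $t := \mathrm{tw}(G)$ and $\lambda$ are treated as fixed, $c$ stands for a small absolute constant absorbing the rounding in (9), and we use $m \le n^2$ together with $\log_2 \hat\lambda = |\log_2 \lambda|$ and $a^{\log_2 n} = n^{\log_2 a}$.

```latex
\begin{align*}
\mathrm{lw}(G) &\le (t+1)(\log_2 n + c) + 1, \\
\hat\lambda^{4\,\mathrm{lw}(G)}
  &\le \hat\lambda^{4(t+1)\log_2 n} \cdot \hat\lambda^{4(t+1)c + 4}
   = n^{4(t+1)\,|\log_2 \lambda|} \cdot \hat\lambda^{4(t+1)c + 4}, \\
m^2\, \hat\lambda^{4\,\mathrm{lw}(G)}
  &\le n^{4} \cdot n^{4(t+1)\,|\log_2 \lambda|} \cdot \hat\lambda^{4(t+1)c + 4}
   = O\!\left(n^{4 + 4(t+1)\,|\log_2 \lambda|}\right).
\end{align*}
```

The last factor $\hat\lambda^{4(t+1)c+4}$ does not depend on $n$, which is where the assumption of bounded tree-width enters.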

In the proof of Theorem 3, we will need the following lemma.

Lemma 5. Suppose $G = (V, E)$ has linear-width $\ell$ and let $\sigma = (e_1, \ldots, e_m)$ be
an ordering of $E$ with linear-width at most $\ell$. Suppose $I, F \in \Omega$ and $H$ is on
$\gamma_{\sigma,I \to F}$. If $w$ is $\lambda$-multiplicative for some $\lambda > 0$, then

$$\frac{w((V,I))\, w((V,F))}{w((V,H))\, w((V,C))} \le \hat\lambda^{4\ell},$$

where $C = I \oplus F \oplus H$.

Proof. Since $H$ is on $\gamma_{\sigma,I \to F}$, we may assume that $H = H_j$ for some
$j \in \{0, \ldots, k\}$. Let $Q = \{e_1, \ldots, e_{i_j}\}$ and $\bar{Q} = E \setminus Q$. Then

$$H = (F \cap Q) \cup (I \cap \bar{Q}) \quad\text{and}\quad C = (I \cap Q) \cup (F \cap \bar{Q}). \tag{12}$$

We can partition $V$ into three sets as follows. Let $V_1$ denote the set of vertices
that are incident only to edges in $Q$; let $V_2$ denote the set of vertices that are
incident only to edges in $\bar{Q}$; let $K$ denote the set of remaining vertices, that is,
the set of vertices incident to edges in $Q$ and $\bar{Q}$. Note that $|K|$ is at most the
linear-width $\ell$.

No vertex $v_1$ of $V_1$ is adjacent to a vertex $v_2$ of $V_2$, as otherwise the edge
between them would simultaneously be in $Q$ and $\bar{Q}$. This implies that $K$ is a
vertex cut separating $V_1$ and $V_2$ with respect to $G$, and also with respect to the
graphs $(V,I)$, $(V,F)$, $(V,H)$, $(V,C)$. Furthermore, $(I \cap Q, I \cap \bar{Q})$, $(F \cap Q, F \cap \bar{Q})$,
$(H \cap Q, H \cap \bar{Q})$, $(C \cap Q, C \cap \bar{Q})$ are edge partitions that are appropriate for $K$.
Therefore, by the fact that $w$ is $\lambda$-multiplicative and $|K| \le \ell$,

$$\hat\lambda^{-\ell} \le \frac{w((V_1 \cup K, J \cap Q))\, w((V_2 \cup K, J \cap \bar{Q}))}{w((V, J))} \le \hat\lambda^{\ell} \quad\text{for } J \in \{I, F, H, C\}.$$

By (12), it follows that

$$(V_1 \cup K, H \cap Q) = (V_1 \cup K, F \cap Q),$$
$$(V_2 \cup K, H \cap \bar{Q}) = (V_2 \cup K, I \cap \bar{Q}),$$
$$(V_1 \cup K, C \cap Q) = (V_1 \cup K, I \cap Q) \quad\text{and}$$
$$(V_2 \cup K, C \cap \bar{Q}) = (V_2 \cup K, F \cap \bar{Q}).$$

Now, letting $\Upsilon$ be

$$w((V_1 \cup K, I \cap Q))\, w((V_2 \cup K, I \cap \bar{Q}))\, w((V_1 \cup K, F \cap Q))\, w((V_2 \cup K, F \cap \bar{Q})),$$

we obtain

$$\frac{w((V,I))\, w((V,F))}{\Upsilon} \le \hat\lambda^{2\ell} \quad\text{and}\quad \frac{\Upsilon}{w((V,H))\, w((V,C))} \le \hat\lambda^{2\ell},$$

whereby the lemma easily follows. □

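Proof of Theorem 3. Let $(H,H') \in \Omega \times \Omega$ such that $P(H,H') > 0$. We will
bound the expression within the max of the definition for $\varrho(\Gamma_\sigma)$. We let $\hat{H} = H$
if $\pi(H) \le \pi(H')$ and $\hat{H} = H'$ otherwise. Denote by $\mathrm{cp}(H,H')$ the set of pairs
$(I,F)$ such that $(H,H') \in \gamma_{\sigma,I \to F}$. We define the function $\mathrm{inj} : \mathrm{cp}(H,H') \to \Omega$
by $(I,F) \mapsto I \oplus F \oplus \hat{H}$. Observe that inj is an injection, for, given $J \in \Omega$,
we can determine the unique $(I, F)$ such that $\mathrm{inj}(I, F) = J$ by first computing
$J \oplus \hat{H} = I \oplus F$ and then using the ordering $\sigma$ to recover $I$ and $F$. Since $w$ is
$\lambda$-multiplicative, we have by Lemma 5 that

$$\frac{w((V,I))\, w((V,F))}{w((V,\hat{H}))\, w((V,\mathrm{inj}(I,F)))} \le \hat\lambda^{4\ell}. \tag{13}$$

Regardless of whether $\pi(H) \le \pi(H')$ or $\pi(H) > \pi(H')$, a brief calculation yields
that $\pi(H) P(H,H') = \pi(\hat{H})/(2m)$; thus,

$$\frac{1}{\pi(H) P(H,H')} \sum_{(I,F) \in \mathrm{cp}(H,H')} \pi(I)\,\pi(F)\,|\gamma_{\sigma,I \to F}|
= \frac{2m}{\pi(\hat{H})} \sum_{(I,F) \in \mathrm{cp}(H,H')} \pi(I)\,\pi(F)\,|\gamma_{\sigma,I \to F}|$$

$$\le \frac{2m^2}{w((V,\hat{H}))\, \mathcal{P}(G)} \sum_{(I,F) \in \mathrm{cp}(H,H')} w((V,I))\, w((V,F)) \tag{14}$$

$$\le \frac{2m^2}{\mathcal{P}(G)} \sum_{(I,F) \in \mathrm{cp}(H,H')} w((V,\mathrm{inj}(I,F)))\, \hat\lambda^{4\ell} \tag{15}$$

$$\le 2m^2 \hat\lambda^{4\ell}, \tag{16}$$

where (14) follows from the facts $|\gamma_{\sigma,I \to F}| \le m$ and $\pi(S) = w((V, S))/\mathcal{P}(G)$,
(15) follows from (13), and (16) follows from the fact that inj is an injection.
Then, substituting the bound (16) into (10), we obtain $\varrho(\Gamma_\sigma) \le 2m^2 \hat\lambda^{4\ell}$, as
claimed. □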
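The injectivity argument in the proof is constructive, and a short Python sketch may make the recovery step easier to follow. It assumes edge subsets are frozensets, that the transition $(H, H')$ lies on the canonical path for the ordering $\sigma$, and that $\hat{H}$ is whichever of $H$, $H'$ was chosen; the function names `encode` and `decode` are hypothetical.

```python
def encode(start, end, hat_h):
    """inj(I, F) = I xor F xor hat_H."""
    return start ^ end ^ hat_h


def decode(ordering, code, h, h_next, hat_h):
    """Recover (I, F) from inj(I, F) and the transition (H, H') on the path."""
    difference = code ^ hat_h                      # equals I xor F
    (flipped,) = h ^ h_next                        # the single edge this transition flips
    position = ordering.index(flipped)
    # Edges of I xor F that precede the flipped edge have already been switched in H.
    already_flipped = {e for e in ordering[:position] if e in difference}
    start = h ^ already_flipped                    # undo those switches to get I
    return start, start ^ difference               # F = I xor (I xor F)


ordering = ["a", "b", "c", "d"]
start = frozenset({"a", "c"})
end = frozenset({"b", "c", "d"})
# Second transition of the canonical path from start to end: {c} -> {b, c}.
h, h_next = frozenset({"c"}), frozenset({"b", "c"})
recovered = decode(ordering, encode(start, end, h), h, h_next, h)
```

Because the pair $(I, F)$ is determined by the encoded set together with the transition, distinct pairs routed through the same transition must receive distinct encodings, which is exactly what (16) uses.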



5 Vertex subset Glauber dynamics for bounded 
tree-width 

Until now, we had been considering edge subsets (subgraphs) and Glauber tran- 
sitions which change one edge at a time. In this section, we modify our meth- 
ods to treat vertex subsets (induced subgraphs) and transitions that involve 
one vertex at a time — each such transition can affect many edges, up to the 
maximum degree of G. We sketch how to obtain rapid mixing for this process 
upon graphs of bounded tree-width still with only a modest condition on the 
base graph polynomials.

A vertex subset expansion formula for $\mathcal{P}$ is written as follows: for any simple
graph $G = (V,E)$,

$$\mathcal{P}(G) = \sum_{S \subseteq V} w(G[S]) \tag{17}$$

for some graph function $w$, where $G[S]$ denotes the subgraph of $G$ induced by $S$.
If the function $w$ is non-negative, we refer to (17) as a vertex subset weighting
for $\mathcal{P}$ and to $w$ as its weight function. Again, for our results to hold, aside
from some other constraints, we need the weight function to be positive on all
induced subgraphs.

From the formulation in (17), we define the single site flip chain $\mathcal{M}'$ on a
given graph $G = (V,E)$ as follows. We start with an arbitrary subset $X_0 \subseteq V$
and repeatedly generate $X_{t+1}$ from $X_t$ by running the following experiment.

1. Pick a vertex $v \in V$ uniformly at random and let $S = X_t \oplus \{v\}$.

2. Set $X_{t+1} = S$ with probability $\frac{1}{2} \min\{1, w(G[S])/w(G[X_t])\}$ and with the
remaining probability set $X_{t+1} = X_t$.

We denote the state space of $\mathcal{M}'$ by $\Omega'$ (i.e. $\Omega' = 2^V$) and its transition
probability matrix by $P'$. It can be shown that $\mathcal{M}'$ is a reversible Markov chain that
has a unique stationary distribution $\pi'$ satisfying $\pi'(S) \propto w(G[S])$. Hence, we
may use $\mathcal{M}'$ as a Markov chain in MCMC sampling for the following problem.

PWV($\mathcal{P}$): $\mathcal{P}$-weighted Vertex Subsets
Input: a graph $G = (V, E)$.

Output: a subset $S \subseteq V$ with probability $w(G[S])/\mathcal{P}(G)$.

We now describe the condition required of the weight function $w$ in (17).
For fixed $\lambda > 0$, we say that the weight function $w$ is vertex $\lambda$-multiplicative,
if for any $G = (V, E)$ and $K$ a vertex cut that separates sets $V_1$ and $V_2$ with
respect to $G$, we have

$$\hat\lambda^{-|K|} \le \frac{w(G[V_1])\, w(G[V_2 \cup K])}{w(G)} \le \hat\lambda^{|K|}. \tag{18}$$

Note that, if $w$ is vertex $\lambda$-multiplicative, then it follows that $w$ is multiplicative
with respect to disjoint union by taking $K = \emptyset$; furthermore, taking $V_1 = \emptyset$
gives that the addition of a few vertices does not change $w$ wildly.
The main result of this section is the following. 

Theorem 6. Let $G = (V,E)$ where $|V| = n$. If $w$ is vertex
$\lambda$-multiplicative for some $\lambda > 0$, then the mixing time of
$\mathcal{M}'$ on $G$ satisfies

$$\tau(\varepsilon) = O\left(n^{2 + 4(\mathrm{tw}(G)+1)|\log\lambda|} \log(1/\varepsilon)\right).$$



5.1 A sketch of the proof 

As before, we do not treat tree-width directly, but instead work with a
different width parameter. For any graph $G = (V,E)$, an ordering
$(v_1, \ldots, v_n)$ of $V$ has vertex-separation at most $\ell$ if, for each
$i \in \{1, \ldots, n\}$, there are at most $\ell$ vertices in
$\{v_1, \ldots, v_{i-1}\}$ that are adjacent to a vertex in
$\{v_i, \ldots, v_n\}$. The vertex-separation $\mathrm{vs}(G)$ of $G = (V,E)$
is the smallest integer $\ell$ such that there is an ordering of $V$ with
vertex-separation at most $\ell$. It was shown by Kinnersley that the
vertex-separation of $G$ satisfies $\mathrm{vs}(G) = \mathrm{pw}(G)$, and so
the earlier inequalities for path-width remain relevant.
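
The definition of vertex-separation translates directly into code. The
following is a minimal sketch, with function names and the adjacency-dict
representation chosen here for illustration; the brute-force minimisation over
all orderings is only feasible for very small graphs.

```python
from itertools import permutations

def ordering_vertex_separation(adj, order):
    """Max over i of the number of vertices in {v_1..v_{i-1}} that are
    adjacent to some vertex in {v_i..v_n}."""
    worst = 0
    for i in range(len(order)):
        prefix, suffix = order[:i], set(order[i:])
        worst = max(worst, sum(1 for u in prefix if adj[u] & suffix))
    return worst

def vertex_separation(adj):
    """Smallest vertex-separation over all orderings (brute force)."""
    return min(ordering_vertex_separation(adj, p)
               for p in permutations(sorted(adj)))
```

For example, the natural ordering of a path has vertex-separation 1, while
every ordering of the complete graph on four vertices has vertex-separation 3.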

To bound the mixing time of $\mathcal{M}'$, we again follow a canonical paths
argument. Given $G = (V,E)$, let $\sigma = (v_1, \ldots, v_n)$ be an ordering
of $V$. Given $I, F \in \Omega'$, let $I \oplus F$ denote the symmetric
difference of $I$ and $F$, let
$\sigma[I \oplus F] := (v_{i_1}, \ldots, v_{i_k})$ denote the restriction of
$\sigma$ to $I \oplus F$ (that is, $\{v_{i_1}, \ldots, v_{i_k}\} = I \oplus F$
and $i_1 < \cdots < i_k$), and let $\gamma_{\sigma, I \to F}$ denote the
canonical path from $I$ to $F$, defined as

$$\gamma_{\sigma, I \to F} := (H_0, \ldots, H_k),$$

where $H_0 = I$, $H_j = H_{j-1} \oplus \{v_{i_j}\}$ for all
$j \in \{1, \ldots, k\}$ (and hence $H_k = F$). Let
$\Gamma_\sigma = \{\gamma_{\sigma, I \to F} \mid I, F \in \Omega'\}$. Using the
earlier inequality relating mixing time to congestion, our bound on the mixing
time again follows from a bound on the congestion $\varrho(\Gamma_\sigma)$,
which is defined analogously to its earlier counterpart.
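
The construction of a canonical path is short enough to write out. The sketch
below (the function name is hypothetical) flips the vertices of the symmetric
difference one at a time, in the order in which they appear in the ordering.

```python
def canonical_path(order, I, F):
    """Return [H_0, ..., H_k] with H_0 = I and H_k = F, flipping the vertices
    of the symmetric difference of I and F in the order given by `order`."""
    I, F = frozenset(I), frozenset(F)
    diff = I ^ F
    path = [I]
    for v in order:                       # restriction of the ordering to diff
        if v in diff:
            path.append(path[-1] ^ {v})   # H_j is H_{j-1} with v flipped
    return path
```

With the ordering (0, 1, 2, 3), I = {0, 1} and F = {1, 2, 3}, the path visits
{0, 1}, {1}, {1, 2}, {1, 2, 3}.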

Theorem 7. Suppose $G = (V,E)$ has vertex-separation $\ell$. Let
$\sigma = (v_1, \ldots, v_n)$ be an ordering of $V$ with vertex-separation at
most $\ell$. If, for some $\lambda > 0$, $w$ is
vertex X-multiplicative, then g(T a ) < In \ . 

Theorem 7 immediately implies a good mixing time bound for the Markov chain
$\mathcal{M}'$ and hence Theorem 6 also.

Corollary 8. Let $G = (V,E)$ where $|V| = n$. If $w$ is vertex
$\lambda$-multiplicative for some $\lambda > 0$, then the mixing time of
$\mathcal{M}'$ on $G$ satisfies

$$\tau(\varepsilon) = O\left(n^2 \lambda^{4\,\mathrm{vs}(G)} \log(1/\varepsilon)\right).$$

Proof. Combine the congestion bound of Theorem 7 with the earlier inequality
relating mixing time to congestion. □

Proof of Theorem 6. Substitute the earlier upper bound on
$\mathrm{vs}(G) = \mathrm{pw}(G)$ into Corollary 8. □
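
The substitution can be made explicit as follows. This is only a sketch under
two assumptions that are not confirmed by the text at hand: that the earlier
path-width bound has the form $\mathrm{pw}(G) \le (\mathrm{tw}(G)+1)\log n$,
and that $\lambda \ge 1$ (so that $|\log\lambda| = \log\lambda$), with the same
logarithm base used throughout.

```latex
\[
  \lambda^{4\,\mathrm{vs}(G)}
  = \lambda^{4\,\mathrm{pw}(G)}
  \le \lambda^{4(\mathrm{tw}(G)+1)\log n}
  = n^{4(\mathrm{tw}(G)+1)\log\lambda},
  \qquad\text{so}\qquad
  n^{2}\lambda^{4\,\mathrm{vs}(G)} \le n^{2+4(\mathrm{tw}(G)+1)\log\lambda}.
\]
```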

We omit the proof of Theorem 7 as it is similar to that of Theorem 2, but
give the details for the analogue of Lemma 5.

Lemma 9. Suppose $G = (V,E)$ has vertex-separation $\ell$ and let
$\sigma = (v_1, \ldots, v_n)$ be an ordering of $V$ with vertex-separation at
most $\ell$. Suppose $I, F \in \Omega'$ and $H$ is on
$\gamma_{\sigma, I \to F}$. If $w$ is vertex $\lambda$-multiplicative for some
$\lambda > 0$, then

$$\frac{w(G[I])\, w(G[F])}{w(G[H])\, w(G[C])} \le \lambda^{4\ell},$$

where $C = I \oplus F \oplus H$.

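Proof. Since $H$ is on $\gamma_{\sigma, I \to F}$, we may assume that
$H = H_j$ for some $j \in \{0, \ldots, k\}$. Let
$Q = \{v_1, \ldots, v_{i_j}\}$ (with $Q = \emptyset$ if $j = 0$) and
$\overline{Q} = V \setminus Q$. Then

$$H = (F \cap Q) \cup (I \cap \overline{Q}) \quad\text{and}\quad C = (I \cap Q) \cup (F \cap \overline{Q}). \qquad (19)$$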

We can partition $V$ into three sets as follows. Let $V_1$ denote the set of
vertices $\overline{Q}$; let $V_2$ denote the subset of $Q$ containing vertices
adjacent only to other vertices of $Q$; and let $K$ denote the set of remaining
vertices, that is, the set of vertices of $Q$ adjacent to vertices of $V_1$.
Note that $|K|$ is at most the vertex-separation $\ell$.

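Clearly, $K$ is a vertex cut separating $V_1$ and $V_2$ with respect to $G$ and
also with respect to the graphs $G[I]$, $G[F]$, $G[H]$, $G[C]$. Therefore, by
the fact that $w$ is vertex $\lambda$-multiplicative, and noting that
$V_2 \cup K = Q$,

$$\lambda^{-\ell} \le \frac{w(G[Q \cap J])\, w(G[\overline{Q} \cap J])}{w(G[J])} \le \lambda^{\ell} \quad\text{for } J \in \{I, F, H, C\}.$$

By (19), it follows that $H \cap Q = F \cap Q$,
$H \cap \overline{Q} = I \cap \overline{Q}$, $C \cap Q = I \cap Q$ and
$C \cap \overline{Q} = F \cap \overline{Q}$. Now, letting
$r = w(G[Q \cap I])\, w(G[\overline{Q} \cap I])\, w(G[Q \cap F])\, w(G[\overline{Q} \cap F])$,
we obtain that

$$\lambda^{-2\ell} \le \frac{r}{w(G[I])\, w(G[F])} \le \lambda^{2\ell} \quad\text{and}\quad \lambda^{-2\ell} \le \frac{r}{w(G[H])\, w(G[C])} \le \lambda^{2\ell},$$

whereby the lemma easily follows. □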
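
The set identities used in this proof can be checked mechanically. The sketch
below walks along a canonical path and verifies, at every state H, that H
agrees with F on Q and with I outside Q, and that C = I ⊕ F ⊕ H does the
opposite. Taking Q to be the prefix of the ordering ending at the most
recently flipped vertex (empty at the start of the path) is an assumption made
for this sketch, as is the function name.

```python
def check_encoding(order, I, F):
    """Check H = (F & Q) | (I & Qbar) and C = (I & Q) | (F & Qbar) for every
    state H on the canonical path from I to F."""
    I, F = frozenset(I), frozenset(F)
    diff = I ^ F
    flips = [v for v in order if v in diff]
    H = I
    for j in range(len(flips) + 1):
        if j > 0:
            H = H ^ {flips[j - 1]}
            # Q: prefix of the ordering up to and including the j-th flip
            Q = frozenset(order[:order.index(flips[j - 1]) + 1])
        else:
            Q = frozenset()
        Qbar = frozenset(order) - Q
        C = I ^ F ^ H
        if H != (F & Q) | (I & Qbar) or C != (I & Q) | (F & Qbar):
            return False
    return True
```

Since I = (C ∩ Q) ∪ (H ∩ Q̄) and F = (H ∩ Q) ∪ (C ∩ Q̄), the pair (I, F) can be
recovered from H, C and Q.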



5.2 An example of a vertex subset chain 

Recalling the earlier expansion of the bivariate interlace polynomial, for
fixed $x, y > 1$,
$w(G[S]) := (x-1)^{\mathrm{rk}_2(S)} (y-1)^{|S| - \mathrm{rk}_2(S)}$ gives a
vertex subset weighting for $q(G; x, y)$. With arguments very similar to those
given in Subsection 2.2, it is not difficult to verify that this weight
function is vertex $\lambda$-multiplicative. So, by Theorem 6, it follows that
a natural Markov chain derived from the bivariate interlace polynomial (a
chain which has not been studied extensively, as far as we are aware) mixes
rapidly on tree-width-bounded graphs.
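
For concreteness, this weight can be evaluated as follows. The sketch assumes
that rk_2(S) denotes the rank over GF(2) of the adjacency matrix of G[S],
which is the usual convention for the interlace polynomial but is not restated
in this section; the function names and the bitmask representation are choices
made here.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    basis = {}                          # highest set bit -> reduced row
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in basis:
                basis[top] = row
                break
            row ^= basis[top]           # eliminate the leading bit
    return len(basis)

def rk2(adj, S):
    """GF(2) rank of the adjacency matrix of the induced subgraph G[S]."""
    index = {v: i for i, v in enumerate(sorted(S))}
    rows = [sum(1 << index[u] for u in adj[v] if u in index) for v in index]
    return gf2_rank(rows)

def interlace_weight(adj, S, x, y):
    """The weight (x - 1) ** rk2(S) * (y - 1) ** (|S| - rk2(S))."""
    r = rk2(adj, S)
    return (x - 1) ** r * (y - 1) ** (len(S) - r)
```

Summing `interlace_weight` over all vertex subsets of a small graph gives the
value of the corresponding subset expansion at (x, y).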

Corollary 10. Let $G = (V,E)$ where $|V| = n$. If $x, y > 1$ are fixed, then
for the single site flip chain on $G$ associated with the above weighting of
$q(G; x, y)$, the mixing time satisfies




We believe that it would be of wider interest to study further properties of 
this single site flip chain on general graphs, in particular to compare it with 
known results on the random cluster, Potts and Ising models. 



6 Conclusion 

In this work, we have developed a new general framework of graph polynomials 
and Markov chains defined via subset expansion formulae for these polynomials, 
and demonstrated that their dynamics mix rapidly for graphs of bounded
tree-width. On a graph $G$ with $n$ vertices, we have shown a mixing time of
order $n^{O(1)} e^{O(\mathrm{pw}(G))} = n^{O(\mathrm{tw}(G))}$. Our results
apply to many of the most prominent and well-known polynomials in the field.
The mixing times of our processes
have, respectively, exponential and super-exponential dependencies upon path- 
width and tree-width. We ask whether this can be improved, in particular, to 
achieve something akin to fixed-parameter tractability in terms of tree-width. 